Skip to content

Conversation

@tobil4sk
Copy link
Member

@tobil4sk tobil4sk commented Aug 23, 2022

This PR is for migrating the password hashes used by Haxelib to Argon2id instead of Md5, which would resolve #314.

Argon2 is a modern and very secure hashing algorithm. I created basic neko bindings for this implementation: https://github.com/P-H-C/phc-winner-argon2

With this PR, the proposed migration path would look like this:

  • The existing database is updated. This involves generating a salt for each user, rehashing the Md5 hash with the new algorithm, and setting the hashmethod field to "md5" for every user. This is handled in src/haxelib/server/Update.hx.

  • New users will have their passwords hashed directly with argon2id, while old users will have their password hashed with md5, and rehashed with argon2id, and then updated properly to argon2id if their credentials are correct. This means Md5 is gradually removed completely as users login, without requiring a mass password reset.

    • Support for old clients for the submit and register commands must be removed. This is because old clients hash the password client-side, which means on the server we cannot know what the original password is, which is needed to properly migrate to argon2id. This is discussed in greater detail in Stop using MD5 for passwords #314. The silver lining is that we can give a helpful error message to old clients with the cause of the error as well as the command to update haxelib install haxelib. Once we get closer to a release it might be worth also stating in this error a minimum version required to register or submit libraries. To handle this, the api version was incremented from 3.0 to 4.0.

    • It is important to note that this method of rehashing an unsalted Md5 hash with argon2id is much less secure than simply using argon2id due to password shucking, so it cannot be a permanent solution. Perhaps we might have to eventually reset passwords for inactive users which still use the legacy hashing method. However, rehashing is a reasonable transitionary step as it immediately gives a security boost to all accounts.

  • Perhaps we could make a release of an older client version as well which updates it to api version 4.0, without all the other changes currently on the main branch.

Things that definitely require feedback:

  1. Earthfile changes. I'm not 100% sure that all the changes fit the style and structure of the current file, so it would be helpful if @andyli could double check this
  2. We need to decide on exactly what parameters we want to use for hashing. To give a brief overview, these are: parallelism, time cost, memory cost. We can also change the hash length and salt length. To make this decision we need to consider what will give us good security without causing too much of a server slow down.

This allows password to be automatically rehashed if the database is old
This is so that we can drop support for submit and register for old
clients, which use 3.0
This is required to properly migrate away from md5, as old clients
performed hashing on client side.
- Passwords are now hashed on the server, with a salt.

- After the database update has run, for old accounts we are left with
their old md5 hash rehashed using argon2id, so to verify their password,
we hash first using md5 and then rehash with argon2id.
This way we can gradually remove the rehashed md5 hashes
This test results in a request to lib.haxe.org, which does not provide
api version 4.0 yet
@tobil4sk
Copy link
Member Author

tobil4sk commented Oct 7, 2022

I implemented the script to handle updating the database (i.e. rehashing the md5 hashes with argon2id). So that it only happens once, there is an extra "Meta" table which has a single record and keeps track of whether the rehashing has been done. For now, this runs in SiteDb.init, which is what also creates the database's tables if they do not exist yet.

However, one thing that is missing is actually altering the tables for the already running database. It would be ideal if we could handle this automatically, that way it is easier for anyone running their own instance of haxelib to update as well. In order to alter the tables, we need to run skeema push somewhere, which will update the tables in line with the changes to the schema files in the skeema directory. I'm not sure where to put this command. Ideally, it should run only once, whenever the server is restarted after an update. We could also move the script for updating the hashes wherever we end up putting this extra command, to avoid redundant checks.

For the algorithm parameters, I think it is fine to use the default ones for now. If we need to adjust them in the future, with this system implemented like this, it won't be hard to change. The important thing is to move away from MD5 as soon as possible.

@tobil4sk
Copy link
Member Author

tobil4sk commented Oct 7, 2022

We also need record-macros to be updated for the tests to build: HaxeFoundation/record-macros#62

@tobil4sk
Copy link
Member Author

tobil4sk commented Oct 19, 2022

@andyli Hi, do you have any idea where the best place to run skeema push would be, so the table schemas can be automatically updated?

EDIT: For now I've changed the PR to handle the table updates manually instead of via skeema push.

This will be done via sql statements for now, because I'm not sure
how to run `skeema push` automatically.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Stop using MD5 for passwords

1 participant